Shallow morphology based complex predicates extraction in Oriya
Abstract
This paper presents the extraction of Complex Predicates (CPs) in Oriya based on shallow morphology and available seed lists of verbs. Oriya is generally a free word order language. Free word order languages have relatively unrestricted local word-group or phrase structures, which makes complex predicate extraction quite challenging. Complex predicates are a special kind of multiword expression; they are extracted here with special emphasis on compound verbs (Verb + Verb) and conjunct verbs (Noun/Adjective + Verb or Verb + Noun/Adjective). The lexicalization of compound and conjunct verbs is based on shallow morphological information. Lexical scopes of compound and conjunct verbs in consecutive sequences of CPs have been identified. The aim of the current work is to investigate the possibility of improving the accuracy of complex predicate extraction by making it sensitive to verb subcategorization, and to evaluate recall, precision, and F-score in different operational environments.
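The pattern-based idea in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the coarse tags (`NOUN`, `ADJ`, `VERB`), the function names, the placeholder tokens, and the seed list of light verbs are all assumptions made for illustration. It covers only the adjacent Verb + Verb and Noun/Adjective + Verb orders, and it scores the output with the standard precision, recall, and F-score definitions.

```python
from typing import List, Set, Tuple

Token = Tuple[str, str]        # (surface form, coarse POS tag); tag names are assumed
CP = Tuple[str, str, str]      # (CP type, first word, second word)


def extract_cps(tokens: List[Token], light_verbs: Set[str]) -> List[CP]:
    """Scan adjacent token pairs and keep those ending in a seed-list verb.

    Verb + seed verb is labeled "compound"; Noun/Adjective + seed verb is
    labeled "conjunct". The Verb + Noun/Adjective order is not handled here.
    """
    cps: List[CP] = []
    for (w1, t1), (w2, t2) in zip(tokens, tokens[1:]):
        if t2 == "VERB" and w2 in light_verbs:
            if t1 == "VERB":
                cps.append(("compound", w1, w2))
            elif t1 in ("NOUN", "ADJ"):
                cps.append(("conjunct", w1, w2))
    return cps


def precision_recall_f1(predicted: Set[CP], gold: Set[CP]) -> Tuple[float, float, float]:
    """Standard precision, recall, and F1 over sets of extracted CPs."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Placeholder tokens, not real Oriya words.
    sentence = [("N1", "NOUN"), ("LV1", "VERB"), ("V2", "VERB"),
                ("LV2", "VERB"), ("X", "ADV")]
    seeds = {"LV1", "LV2"}     # hypothetical seed list of light verbs
    found = extract_cps(sentence, seeds)
    gold = {("conjunct", "N1", "LV1"), ("compound", "V2", "LV2"),
            ("compound", "V3", "LV1")}
    print(found)
    print(precision_recall_f1(set(found), gold))
```

A real extractor would work on morphologically analyzed Oriya text and would need extra checks, for example on verb inflection or subcategorization, to filter out adjacent pairs that are not true complex predicates.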
Similar resources
Automatic Extraction of Complex Predicates in Bengali
This paper presents the automatic extraction of Complex Predicates (CPs) in Bengali with a special focus on compound verbs (Verb + Verb) and conjunct verbs (Noun /Adjective + Verb). The lexical patterns of compound and conjunct verbs are extracted based on the information of shallow morphology and available seed lists of verbs. Lexical scopes of compound and conjunct verbs in consecutive sequen...
The Interlanguage of Persian Learners of Italian: a Focus on Complex Predicates
This paper aims at investigating the acquisition of Italian complex predicates by native speakers of Persian. Complex predication is not as pervasive a phenomenon in Italian as it is in Persian. Yet Italian native speakers use complex predicates productively; spontaneous data show that Persian learners of Italian seem to be perfectly aware of Italian complex predicates and use this familiar fea...
Zone Based Relative Density Feature Extraction Algorithm for Unconstrained Handwritten Numeral Recognition
Handwritten digit recognition has been a challenging problem for researchers for a few decades. This paper proposes a relative density feature extraction algorithm for recognizing unconstrained, single, connected handwritten numerals independent of language. The proposed method consists of four phases, namely, image enhancement (dilation), representation (zone based), ...
A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model
Information extraction (IE) is the process of automatically producing a structured representation from unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP) that has been intensified by the increased volume, heterogeneity, and unstructured form of information. One of the core information extraction tasks is relation extraction wh...
Evaluation of the NLP Components of an Information Extraction System for German
This paper describes ongoing work on the evaluation of the NLP components of the core engine of smes (Saarbrücker Message Extraction System), which consists of a tokenizer, an efficient and robust German morphology, a part-of-speech (POS) tagger, a shallow parsing module, a linguistic knowledge base and an output construction component. Currently the morphology, the tagger and a parsing module ...